A model of dynamic auditory perception and its application to robust word recognition

Authors

  • Brian Strope
  • Abeer Alwan
Abstract

This paper describes two mechanisms that augment the common automatic speech recognition (ASR) front end and provide adaptation and isolation of local spectral peaks. A dynamic model consisting of a linear filterbank with a novel additive logarithmic adaptation stage after each filter output is proposed. An extensive series of perceptual forward-masking experiments, together with previously reported forward-masking data, determines the model’s dynamic parameters. Once parameterized, the simple exponential dynamic mechanism predicts the nature of forward-masking data from several studies across wide-ranging frequencies, input levels, and probe delay times. An initial evaluation of the dynamic model together with a local peak isolation mechanism as a front end for dynamic time warp (DTW) and hidden Markov model (HMM) word recognition systems shows an improvement in robustness to background noise when compared to Mel-frequency cepstral coefficients (MFCC), linear prediction cepstral coefficients (LPCC), and relative spectra (RASTA) based front ends.
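To make the idea of an additive logarithmic adaptation stage concrete, the following is a minimal sketch for a single filterbank channel. It is not the authors' parameterized model: the update rule, the function names `adapt_log_channel` and `probe_response`, the 10 ms frame step, and the 50 ms time constant are all illustrative assumptions. It only shows how an exponentially decaying offset, subtracted in the log domain, can reduce the response to a probe that closely follows a loud masker, which is the qualitative shape of forward masking.

```python
import math

def adapt_log_channel(log_energy_db, frame_s=0.01, tau_s=0.05):
    """Subtract a slowly adapting offset from one channel's log energy (dB).

    Assumed first-order exponential tracker; frame_s and tau_s are
    hypothetical values, not parameters taken from the paper.
    """
    alpha = 1.0 - math.exp(-frame_s / tau_s)  # per-frame tracking rate
    offset = log_energy_db[0]                 # start adapted to the first frame
    adapted = []
    for level in log_energy_db:
        adapted.append(level - offset)        # additive adaptation in the log domain
        offset += alpha * (level - offset)    # offset decays exponentially toward input
    return adapted

def probe_response(gap_frames, masker_db=80.0, probe_db=40.0):
    """Adapted output for a probe that follows a masker after a silent gap."""
    frames = [masker_db] * 20 + [0.0] * gap_frames + [probe_db]
    return adapt_log_channel(frames)[-1]

if __name__ == "__main__":
    # The probe response recovers as the masker-to-probe delay grows.
    for gap in (2, 5, 20):
        print(gap, round(probe_response(gap), 1))
```

In this sketch the offset is updated after the output is computed, so onsets pass through at full strength and only sustained energy is adapted away. That ordering is a design choice of the example, not a claim about the published model.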

Similar resources

Correlation between Auditory Spectral Resolution and Speech Perception in Children with Cochlear Implants

Background: Variability in speech performance is a major concern for children with cochlear implants (CIs). Spectral resolution is an important acoustic component in speech perception. Considerable variability and limitations of spectral resolution in children with CIs may lead to individual differences in speech performance. The aim of this study was to assess the correlation between auditory ...


Psychoacoustics and speech perception in individuals with auditory neuropathy and normal individuals

Background: The main result of hearing impairment is reduced speech perception. Patients with auditory neuropathy can hear but cannot understand. Their difficulties have been traced to timing-related deficits, revealing the importance of the neural encoding of timing cues for understanding speech. Objective: In the present study psychoacoustic perception (minimal noticeable differen...


Effect of signal to noise ratio on the speech perception ability of older adults

Background: Speech perception ability depends on auditory and extra-auditory elements. The signal-to-noise ratio (SNR) is an extra-auditory element that has an effect on the ability to normally follow speech and maintain a conversation. Speech in noise perception difficulty is a common complaint of the elderly. In this study, the importance of SNR magnitude as an extra-auditory effect on speech...


Effects of ageing on speed and temporal resolution of speech stimuli in older adults

Background: According to previous studies, most speech recognition disorders in older adults result from deficits in audibility and auditory temporal resolution. In this paper, the effect of ageing on time-compressed speech and auditory temporal resolution was studied by word recognition in continuous and interrupted noise. Methods: A time-compressed speech test (TCST) w...


Verbal-Auditory Skills in 5-year-Old Children of Semnan/Iran in 2006

Introduction: This research was planned to determine some verbal-auditory skills (verbal-auditory short memory and phonological awareness) that have the closest relationship with speech and language development in 5-year-old children. Method: In this descriptive cross-sectional study, 400 children of pre-school classes affiliated to Education and Welfare organizations in Semnan city were select...



Journal title:
  • IEEE Trans. Speech and Audio Processing

Volume 5

Publication date: 1997